@p1k0pan (Contributor) commented Jan 25, 2026

After training Qwen3-VL-8B with Megatron, the checkpoint could not be converted from torch_dist to HF format. This PR adds the conversion code.
Tested on Qwen3-VL-8B only; I'm not sure whether it also works for the Qwen3-VL MoE model.

@zhuzilin zhuzilin merged commit 79012cb into THUDM:main Jan 28, 2026
@weixiao-zhan

Hi, the file name might need to be qwen3_vl.py.

(head, rank=0, pid=40184)     from .qwen3_vl import convert_qwen3vl_to_hf
(head, rank=0, pid=40184) ModuleNotFoundError: No module named 'slime.backends.megatron_utils.megatron_to_hf.qwen3_vl'
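
For illustration, here is a minimal sketch of why the file name matters. It builds a throwaway package (`demo_pkg`, a hypothetical name, not part of slime) whose `__init__.py` does the same relative import as in the traceback, and tries it against two file names. The wrong name `qwen3vl.py` is only an assumed example; the actual name of the file in the PR is not shown here.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def try_import(filename: str) -> str:
    """Create a temporary package whose converter lives in `filename`, then import it."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_pkg"  # hypothetical package, stands in for megatron_to_hf
        pkg.mkdir()
        # Same relative import as in the traceback above.
        (pkg / "__init__.py").write_text(
            "from .qwen3_vl import convert_qwen3vl_to_hf\n"
        )
        # Stub converter; the real one would map Megatron weights to HF names.
        (pkg / filename).write_text(
            "def convert_qwen3vl_to_hf(*args):\n    return 'ok'\n"
        )
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            importlib.import_module("demo_pkg")
            return "imported"
        except ModuleNotFoundError as exc:
            return f"failed: {exc}"
        finally:
            # Undo the path change and drop any cached modules from this run.
            sys.path.remove(tmp)
            for name in [m for m in sys.modules
                         if m == "demo_pkg" or m.startswith("demo_pkg.")]:
                del sys.modules[name]


if __name__ == "__main__":
    print(try_import("qwen3_vl.py"))  # file name matches the import
    print(try_import("qwen3vl.py"))   # assumed mismatched name
```

The relative import `.qwen3_vl` only resolves when a module with exactly that name exists in the package, so either the file or the import statement has to change.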
